INTRODUCTION:


c-Myc and breast cancer:

Amplification  of  the  c-myc  gene  has  been  reported  in  as  many  as  20-30%  of 
sporadic  breast  tumors,  and  may  be  associated  with  a  relatively  poor  prognosis  (Escot 
et  al.,  1986;  Watson  et  al.,  1993).  Of  perhaps  even  greater  significance  is  the 
observation  that  constitutive  expression  of  c-Myc  predisposes  mammary  tissue  to 
carcinoma  (Schoenenberger  et  al.,  1988;  Stewart  et  al.,  1984),  as  is  the  case  in  many 
other  cell  lineages  (Leder  et  al.,  1986).  The  potential  implications  of  uncontrolled  c- 
Myc  expression  are  further  illustrated  by  the  finding  that  it  can  allow  cells  to  become 
transformed  without  an  accompanying  mutation  in  the  tumor  suppressor  gene  p53  (Lu 
et  al.,  1992),  abnormalities  of  which  are  associated  with  both  sporadic  and  hereditary 
breast  cancer  (Harris  et  al.,  1992).  An  investigation  of  c-Myc  function  is  therefore 
relevant  to  breast  cancer  not  only  because  of  its  specific  association  with  mammary 
carcinoma,  but  also  because  activation  of  the  cellular  regulatory  networks  in  which  it  is 
involved  seems  in  general  to  contribute  to  oncogenesis.  These  observations  suggest 
that  c-Myc  will  be  a  promising  target  for  development  of  future  antineoplastic  therapies 
which  are  designed  specifically  to  inhibit  its  function. 

To  generate  such  molecularly-based  therapies,  it  will  be  important  not  only  to 
identify  cellular  factors  that  c-Myc  activates  (and  is  activated  by),  but  also  to 
understand  the  specific  molecular  interactions  involved.  In  this  project,  we  have 
begun  to  address  this  last  issue.  Using  structural  information  as  a  guide,  we  are 
identifying  particular  protein-protein  and  protein-DNA  interactions  that  are  essential  for 
the  function  and  target  specificity  of  helix-loop-helix  proteins,  including  c-Myc.  A 
detailed  understanding  of  these  intermolecular  interactions  is  essential  for  an 
understanding  of  c-Myc  biology  and  necessary  for  design  of  therapeutics. 

c-Myc as a transcriptional regulator:

Evidence  suggests  that  c-Myc  is  involved  in  regulating  progression  through  the 
cell  cycle  (Jansen-Durr  et  al.,  1993;  Luscher  and  Eisenman,  1990).  In  the  mouse,  both 
the  c-  and  N-Myc  genes  are  essential  for  development,  but  either  can  be  disrupted 
without  impairing  the  viability  of  individual  embryonic  stem  cells  (Charron  et  al.,  1992; 
Davis  et  al.,  1993;  Moens  et  al.,  1992;  Sawai  et  al.,  1993;  Stanton  et  al.,  1992).  These 
latter  findings  demonstrate  that  Myc  proteins  are  not  parts  of  the  essential  cell  cycle 
machinery,  and  suggest  instead  that  they  transmit  proliferative  signals  to  it.  Indeed,  c- 
Myc  seems  to  interact  with  multiple  cellular  signalling  pathways,  as  is  indicated  by  the 
apparent  complexity  of  its  transformation-inducing  capability  (Lu  et  al.,  1992;  Sawyers 
et  al.,  1992),  and  by  the  observation  that  in  cells  which  have  been  deprived  of  growth 
factors,  expression  of  c-Myc  can  induce  apoptosis  (Evan  et  al.,  1992;  Neiman  et  al., 
1991).  Apparently,  c-Myc  is  part  of  a  regulatory  network  that  induces  apoptosis  if  the 
cell  is  receiving  mixed  or  inappropriate  signals  regarding  whether  to  proliferate  (Shi  et 
al.,  1992).  Thus,  while  over-expression  of  c-Myc  appears  to  contribute  to  cellular 
transformation  by  inducing  proliferation  (Luscher  and  Eisenman,  1990),  the  pathways 
by  which  it  does  so  seem  to  be  complex. 

An  essential  insight  into  how  c-Myc  might  perform  these  functions  has  come 
from  the  realization  that  Myc  proteins  are  members  of  the  basic-helix-loop-helix 
(bHLH)  family  of  DNA-binding  proteins  (Figure  1)  (Davis  et  al.,  1987;  Murre  et  al., 
1989).  In  general,  members  of  this  large  family  are  involved  in  transcriptional 
regulation,  with  some  playing  a  role  in  cellular  differentiation,  and  others  implicated  in 
oncogenesis  (Weintraub  et  al.,  1991).  They  are  defined  by  the  HLH  domain  (Murre  et 
al., 1989; Murre et al., 1989), which allows them to form dimers, and by a region of
basic amino acids (BR) which lies immediately N-terminal to this domain and through
which  they  bind  to  specific  DNA  sequences  (Davis  et  al.,  1990;  Voronova  and 
Baltimore,  1990).  Myc  proteins  belong  to  a  bHLH  subgroup  (bHLH-ZIP  proteins)  in 
which  a  "leucine  zipper"  (ZIP)  domain  (Landschulz  et  al.,  1988)  is  located  immediately 
C-terminal  to  the  HLH  domain  (Blackwood  and  Eisenman,  1991),  and  provides  a 
critical  contribution  to  dimerization  (Beckmann  and  Kadesch,  1991;  Davis  and 
Halazonetis, 1993; Ferre-D'Amare et al., 1993; Fisher et al., 1991; Halazonetis and
Kandil, 1992; Ma et al., 1993). ZIP domains form an amphipathic α-helix that
dimerizes  as  a  coiled-coil  (O'Shea  et  al.,  1989),  and  thus  also  define  a  separate  family 
of  transcriptional  regulatory  proteins  (the  b-ZIP  proteins)  (Johnson  and  McKnight, 
1989).  bHLH  proteins  must  form  dimers  to  bind  to  DNA  (Davis  et  al.,  1990;  Voronova 
and  Baltimore,  1990),  and  generally  recognize  sites  that  contain  the  palindromic 
consensus CA--TG (Lassar et al., 1989), with each respective BR binding to half of
the site (Blackwell and Weintraub, 1990; Ferre-D'Amare et al., 1993). Some bHLH
protein  family  members  readily  form  homodimers,  but  others  do  not,  and  appear  to 
require  a  different  dimerization  partner  (Weintraub  et  al.,  1991).  For  example,  while 
Myc protein bHLH-ZIP domains can bind DNA in vitro as homodimers (Alex et al.,
1992; Blackwell et al., 1990; Kerkhoff et al., 1991; Ma et al., 1993), they dimerize (and
thus  bind  DNA)  far  more  efficiently  as  heterodimers  with  Max,  a  widely-expressed 
bHLH-ZIP  protein  (Blackwood  and  Eisenman,  1991;  Prendergast  et  al.,  1991)  which  is 
required  for  their  capacity  to  transform  cells,  and  appears  to  be  essential  for  their 
normal  functions  (Amati  et  al.,  1993;  Blackwood  et  al.,  1992;  Kato  et  al.,  1992; 
Mukherjee et al., 1992; Prendergast et al., 1992; Wenzel et al., 1991).

Figure 1:

Representative bHLH domains (taken from Benezra et al., 1990). Conserved amino acids are shaded.
Positions that correspond to the a and d positions of amphipathic alpha helices are indicated. Helix-
disrupting proline residues are circled. MyoD BR residues are numbered in the text so that the left-most
shaded R residue corresponds to 1.

By  analogy  to  other  bHLH  proteins,  it  would  be  predicted  that  Myc  proteins 
would  be  involved  in  transcriptional  regulation  (Collum  and  Alt,  1990;  Luscher  and 
Eisenman,  1990).  Recent  experiments  support  this  idea  (Amati  et  al.,  1992;  Amin  et 
al.,  1993;  Gu  et  al.,  1993;  Kato  et  al.,  1990;  Kretzner  et  al.,  1992),  and  have  suggested 
the  following  model:  c-Myc/Max  and  Max/Max  complexes  compete  for  the  same  DNA 
targets, at which c-Myc/Max activates and Max/Max blocks transcription (Amati et al.,
1992; Kretzner et al., 1992). Direct repression can be achieved at these sites by
binding  of  heterodimers  of  Max  with  the  bHLH-ZIP  proteins  Mad  (Ayer  et  al.,  1993)  or 
Mxi  (Zervos  et  al.,  1993).  In  the  remainder  of  this  report,  this  group  of  proteins  will  be 
referred  to  as  the  Myc/Max/Mad/Mxi  network,  because  they  appear  to  be  linked  in 
function  by  their  abilities  to  interact  with  each  other  and  to  recognize  common  DNA 
sequences. In addition to its apparent function as an activator, c-Myc can inhibit
transcriptional initiation which is mediated by the TFII-I protein, a finding which may
explain  how  c-Myc  appears  to  repress  transcription  of  particular  genes  (Roy  et  al., 
1993).  In  recent  years,  the  list  of  putative  direct  or  indirect  targets  of  the 
Myc/Max/Mad/Mxi  network  has  been  growing  longer,  and  includes  proteins  associated 
with  cell  division,  such  as  cyclins  and  regulators  of  cyclin-dependent  kinase  activity 
(Bello-Fernandez  et  al.,  1993;  Benvenisty  et  al.,  1992;  Born  et  al.,  1994;  Daksis  et  al., 
1994; Eilers et al., 1991; Galaktionov et al., 1996; Grandori et al., 1996; Hann et al.,
1994;  Jansen-Durr  et  al.,  1993;  Jones  et  al.,  1996). 

The two c-Myc regions which are indispensable for its transforming capability
(Stone  et  al.,  1987)  were  later  identified  as  its  transcriptional  activator  and  bHLH-ZIP 
domains,  indicating  the  importance  of  its  ability  to  regulate  transcription.  Myc  proteins 
are  remarkably  conserved  in  these  regions  (Schreiber-Agus  et  al.,  1993;  Walker  et  al., 
1992),  and  can  all  bind  to  sites  that  contain  CACGTG  or  CATGTG  core  sequences 
(Alex  et  al.,  1992;  Berberich  et  al.,  1992;  Blackwell  et  al.,  1990;  Kato  et  al.,  1992;  Ma  et 
al.,  1993;  Papoulas  et  al.,  1992),  suggesting  that  they  may  have  similar  or  overlapping 
functions.  However,  a  number  of  related  bHLH  proteins,  including  the  bHLH-ZIP 
transcriptional  regulatory  proteins  USF,  TFE3,  and  TFEB,  can  also  bind  to  the  same 
sequences  (Beckmann  et  al.,  1990;  Carr  and  Sharp,  1990;  Gregor  et  al.,  1990).  All  of 
these  bHLH-ZIP  proteins  contain  in  their  respective  BRs  an  arginine  (R)  residue  (R13) 
which is essential for recognition of these particular CA--TG sites (Blackwell et al.,
1993; Dang et al., 1992; Halazonetis and Kandil, 1992; Van Antwerp et al., 1992), and
which directly contacts the central bases in them (Ferre-D'Amare et al., 1993). These
similarities in DNA recognition raise the issue of how Myc proteins and these other
bHLH-ZIP  proteins  might  be  able  to  act  on  different  genes,  and  would  appear  to 
suggest  that  any  differences  in  their  target  specificities  would  necessarily  be 
determined  by  interactions  with  cooperating  factors.  Conversely,  some  differences  in 
DNA  recognition  have  been  identified  among  them  (Blackwell  et  al.,  1993;  Fisher  and 
Goding, 1992; Halazonetis and Kandil, 1991; Prochownik and Van Antwerp, 1993).
For example, the ability to bind to certain "non-canonical" sites, which are based on
variants of the CA--TG consensus, is shared by the Myc/Max/Mad proteins, but not by
the  other  related  bHLH-ZIP  proteins,  indicating  that  it  might  confer  some  degree  of 
specificity  and  thus  be  of  biological  significance  (Blackwell  et  al.,  1993).  Such  DNA 
sequences  have  been  found  recently  to  be  associated  with  a  number  of  candidate 
Myc-responsive  genes  (Grandori  et  al.,  1996). 
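
As an illustration of the sequence features discussed above, the following minimal Python sketch scans a DNA sequence for sites matching the CA--TG consensus (written here as CANNTG) and reports whether each site is palindromic. The function names and the example sequence are hypothetical and serve only as an illustration; the sketch does not model binding affinity or the flanking-sequence preferences described in the text.

```python
import re

# CANNTG consensus; the lookahead lets overlapping sites be reported too.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_eboxes(seq):
    """Return (position, site) pairs for every CANNTG match in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


def is_palindromic(site):
    """True if the site equals its own reverse complement."""
    return site == site.translate(COMPLEMENT)[::-1]


# Hypothetical example sequence containing three CANNTG sites.
example = "ttCACGTGaaCATGTGggCAGCTG"
for position, site in find_eboxes(example):
    print(position, site, "palindromic" if is_palindromic(site) else "asymmetric")
```

In this sketch CACGTG is reported as palindromic and CATGTG as asymmetric, which is the distinction between the two core sequences mentioned above.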

The  relationship  between  DNA-binding  and  transcriptional  regulation  by  c-Myc 
may  be  complex,  as  is  suggested  by  the  example  of  the  bHLH  protein  MyoD  (Figure 
1).  MyoD  induces  many  cell  types  to  differentiate  into  muscle  (Davis  et  al.,  1987; 
Weintraub et al., 1989), and it functions as a heterodimer with members of the widely-
expressed E2A family of bHLH proteins (e.g., E12; Figure 1) (Lassar et al., 1991; Murre
et  al.,  1989).  Mutational  analyses  of  MyoD  and  of  related  bHLH  proteins  have  shown 
that  certain  BR  mutations  allow  them  to  bind  to  appropriate  DNA  sequences  but 
interfere  with  their  ability  to  activate  transcription  or  induce  myogenesis  (Davis  et  al., 
1990;  Davis  and  Weintraub,  1992;  Schwarz  et  al.,  1992;  Weintraub  et  al.,  1991). 

These  findings  suggest  that  the  MyoD  BR  is  involved  in  protein-protein  interactions  as 
well as in binding to DNA (Weintraub et al., 1991). Such a mechanism (referred to as
"positive control"; Hochschild et al., 1983) has been described in other families of
DNA-binding  proteins  (Kristie  and  Sharp,  1990;  Lai  et  al.,  1992;  Stern  et  al.,  1989).  In 
the  case  of  MyoD,  it  has  been  proposed  that  appropriate  protein-DNA  and  protein- 
protein  interactions  are  required  for  exposure  of  its  transcriptional  activator  domain, 
which  appears  to  be  "buried"  within  the  protein  when  its  BR  is  not  bound  to  DNA 
(Weintraub et al., 1991). This mechanism could potentially contribute to target
specificity, if only a subset of MyoD binding sites were to allow binding in a
conformation  which  would  permit  these  protein-protein  interactions  to  occur,  and  thus 
were  capable  of  inducing  transcriptional  activation  (Weintraub  et  al.,  1991).  Such 
complex  mechanisms  for  determining  target  specificity  could  potentially  be  utilized  by 
other  bHLH  proteins,  including  those  of  the  Myc  family.  In  fact,  members  of  different 
bHLH  protein  groups,  including  the  Myc  proteins,  are  characterized  by  particular 
amino  acids  in  their  BRs  for  some  of  which  no  direct  role  in  determining  DNA-binding 
specificity has yet been demonstrated (Ferre-D'Amare et al., 1993; Fisher et al., 1993).
The  conservation  of  these  amino  acids  suggests  biological  importance,  either  for  as 
yet  undetermined  effects  on  DNA-binding,  for  protein-protein  interactions,  or  both. 

bHLH  protein  structure: 

Recent  insights  into  c-Myc  protein-protein  and  protein-DNA  interactions  present 
the  prospect  that  in  the  future,  antineoplastic  therapeutics  might  be  designed  to 
interfere with them (see Perutz, 1992), and thus block the ability of c-Myc to regulate
transcription.  The  determination  of  structures  for  bHLH  protein-DNA  complexes 
represents  a  major  step  forward  in  this  direction.  These  efforts  have  shown  that  Max 
forms a parallel, left-handed, four-helix bundle in which the ZIP domain continues C-
terminally from helix 2, and the BR extends as an α-helix N-terminally from helix 1 as it
crosses the major groove of B-form DNA (Ferre-D'Amare et al., 1993). Recently
determined structures for complexes of the bHLH proteins E47 and MyoD (which lack
a ZIP domain) bound to DNA have further revealed that the configuration of the HLH
domain  fold  is  remarkably  preserved  between  bHLH  and  bHLH-ZIP  proteins 
(Ellenberger  et  al.,  1994;  Ma  et  al.,  1994).  While  these  structures  have  demonstrated 
how  the  HLH  dimerization  interface  is  formed,  and  have  made  predictions  about 
critical  protein-DNA  contacts  which  can  now  be  tested,  they  also  leave  open  a  number 
of  questions.  For  example,  they  have  not  suggested  roles  for  a  number  of  BR  residues 
which do not contact bases, yet are conserved within different bHLH protein subfamilies
(Benezra et al., 1990), and thus might be essential for their function. It is also
not  clear  why  bHLH-ZIP  proteins  require  the  ZIP  domain  for  dimerization,  or  what 
determines  the  dimerization  specificities  of  HLH  domains.  In  addition,  because  these 
structures  were  determined  using  isolated  bHLH  or  bHLH-ZIP  domains,  they  do  not 
address  potential  interactions  between  them  and  the  remainder  of  these  proteins. 

Significantly,  these  studies  do  provide  an  essential  basis  for  investigating  such 
issues  by  a  program  of  integrated  mutagenesis,  biochemical,  and  molecular  modeling 
experiments. As part of this research, I have undertaken such a program in
collaboration  with  Dr.  Thomas  Ellenberger,  who  is  investigating  bHLH  protein  structure 
by  X-ray  crystallography  and  will  incorporate  our  findings  into  further  structural 
investigations.  Our  goal  is  to  gain  insights  into  the  specificity  of  these  protein-DNA  and 
protein-protein  interactions  that  will  contribute  to  our  understanding  of  the  biology  of  c- 
Myc  and  of  other  bHLH  proteins,  and  that  will  thus  be  an  essential  complement  to 
efforts  underway  in  other  laboratories  to  identify  Myc-responsive  genes.  Results  from 
our  experiments  should  thus  be  of  particular  value  for  future  efforts  at  "rational" 
molecular  design  of  antineoplastic  therapies. 

The Mastermind protein, Notch signaling, and mammary oncogenesis:

In  a  change  of  specific  aim  which  has  been  approved  by  the  Army,  our  second 
aim  is  now  a  study  of  DNA  binding  by  the  Mastermind  protein. 

Like  bHLH  proteins,  other  BR-containing  proteins  generally  bind  to  DNA  as 
dimers.  For  example,  the  bZIP  proteins  bind  to  DNA  only  as  dimers,  through  a  distinct 
type of BR which also lies in the major groove (Ellenberger et al., 1992). However,
one example of monomeric BR-DNA binding has been identified, the SKN-1 protein of
C.  elegans  (Blackwell  et  al.,  1994;  Bowerman  et  al.,  1992).  SKN-1  contains  at  its  C- 
terminus  a  BR  like  that  of  bZIP  proteins,  but  it  lacks  a  dimerization  domain.  Instead,  it 
binds  to  DNA  sequence-specifically  as  a  monomer,  by  means  of  an  85  residue  domain 
that  places  a  flexible  N-terminal  "arm"  into  the  minor  groove  of  an  AT-rich  region,  and 
stabilizes  the  BR  by  means  of  a  predominantly  helical  intervening  region  (Blackwell  et 
al.,  1994).  Only  one  other  example  has  been  identified  of  a  BR  that  lacks  a  ZIP  or  an 
HLH  segment,  the  Mastermind  (Mam)  protein  of  Drosophila  (Smoller  et  al.,  1990). 

Mam  contains  a  BR  (Figure  2),  but  lacks  sequences  that  are  similar  to  either  SKN-1  or 
bZIP  proteins. 

Figure  2: 

Alignment  of  the  Mastermind  BR  with  representative  BRs  (Smoller  et  al.,  1990).  Conserved 
residues  are  boxed,  and  indicated  with  arrows. 


Mam  is  relevant  to  breast  cancer  because  it  is  required  for  implementation  of 
signaling  in  response  to  Notch  proteins  (Artavanis-Tsakonas  et  al.,  1995;  Smoller  et 
al.,  1990),  which  have  been  implicated  in  mammary  oncogenesis  (see  below).  Notch 
is  a  transmembrane  protein  which  is  involved  in  numerous  embryonic  signaling 
events,  in  which  cells  are  directed  to  follow  or  to  suppress  programs  of  differentiation. 
Proteins  related  to  Notch  have  been  found  in  organisms  as  diverse  as  C.  elegans  and 
humans,  and  are  utilized  in  myriad  decisions  of  cell  fate  in  the  developing  embryo 
(Artavanis-Tsakonas  et  al.,  1995).  At  least  in  part,  in  both  Drosophila  and  vertebrates, 
the  Notch  signal  appears  to  be  effected  through  transcriptional  activation  by  the 
Suppressor  of  Hairless  (Su(H))  protein,  which  binds  to  regulatory  sequences  at  target 
genes  (Fortini  and  Artavanis-Tsakonas,  1994).  Expression  of  a  truncated  Notch 
protein,  which  lacks  the  extracellular  domain,  results  in  a  constitutive  Notch  signal 
(Artavanis-Tsakonas  et  al.,  1995).  Significantly,  this  Notch  fragment  has  been 
demonstrated  to  associate  with  DNA-bound  Su(H),  and  thus  to  convert  it  to  an 
activator  (Jarriault  et  al.,  1995).  It  has  been  proposed  that  transduction  of  the  Notch 
signal  involves  proteolytic  cleavage  which  liberates  this  Notch  fragment,  and  allows  it 
to be translocated to the nucleus (Jarriault et al., 1995). As Mam is a nuclear protein, it
is  likely  to  be  involved  in  these  activation  events,  or  in  the  functioning  of  gene  products 
that  are  expressed  in  response  to  Notch  signaling. 

Insertion  of  the  mouse  mammary  tumor  virus  (MMTV)  into  the  mouse  int-3  gene, 
a  Notch  family  member,  results  in  generation  of  mammary  cell  tumors  (Robbins  et  al., 
1992; van Leeuwen and Nusse, 1995). This transformation event appears to be
mediated by production of a constitutively-active truncated Int-3 protein, and
expression  of  such  a  protein  in  mammary  cells  interferes  with  their  differentiation  and 
results  in  their  transformation  (Jhappan  et  al.,  1992;  Smith  et  al.,  1995).  These 
findings  implicate  activation  of  the  Notch  pathway  in  mammary  carcinoma.  Analogous 
activated  Notch  proteins  have  been  linked  to  lymphoid  tumors  (Artavanis-Tsakonas  et 
al.,  1995),  and  have  been  demonstrated  to  cooperate  with  c-Myc  to  induce  thymomas 
(Girard  et  al.,  1996).  An  understanding  of  this  pathway  is  therefore  relevant  to  breast 
cancer,  and  to  cancer  in  general. 

We  will  attempt  to  identify  specific  DNA  sequences  that  are  bound  by  the 
Drosophila  Mastermind  protein,  either  alone,  or  together  with  candidate  cooperating 
co-factors  such  as  Su(H).  These  experiments  will  consist  of  in  vitro  selections  from 
random  sequence  libraries  (Blackwell,  1995),  as  well  as  gel  mobility  shift  assays  using 
sequences  from  candidate  target  genes.  As  these  experiments  yield  results,  we  will 
move  on  to  investigation  of  how  the  newly-described  Mastermind  DNA-binding 
domain  might  recognize  DNA  (and  associated  proteins),  and  perform  cell  culture 
investigations  of  Mastermind  function.  These  experiments  will  also  serve  as  a  basis 
for  future  attempts  to  identify  vertebrate  mastermind  genes. 

BODY: 


A.  Investigation  of  c-Myc  and  bHLH  protein-protein  and  protein-DNA 
interactions. 

1.  Protein-protein  interactions: 

The  dimerization  specificities  of  ZIP  domains  appear  to  be  determined  by 
interactions  between  charged  amino  acids  which  lie  adjacent  to  their  dimerization 
interface  (O'Shea  et  al.,  1992;  Vinson  et  al.,  1993),  and  evidence  suggests  that 
interactions  between  the  ZIP  domains  of  c-Myc  and  Max  follow  similar  principles 
(Amati  et  al.,  1993).  However,  it  is  not  understood  how  the  dimerization  specificities  of 
HLH  domains  are  determined,  nor  is  it  known  why  the  HLH  domains  of  bHLH-ZIP 
proteins  such  as  c-Myc  and  Max  do  not  dimerize  efficiently,  so  that  they  generally 
require  the  ZIP  domain  (see  above).  To  address  these  issues,  we  have  begun  to  use 
bHLH  protein  structures  that  were  derived  by  X-ray  crystallography  as  a  starting  point 
for  mutational  analyses  and  molecular  modeling  experiments. 

Although  in  bHLH-ZIP  proteins  the  HLH  domain  is  not  sufficient  to  mediate 
dimerization,  its  integrity  appears  to  be  required  for  dimer  formation  (Davis  and 
Halazonetis,  1993;  Reddy  et  al.,  1992),  and  it  is  critical  for  orientation  of  the  BRs 
(Ferre-D'Amare et al., 1993). The HLH domain does not follow the paradigm
represented by the ZIP domain, in which hydrophobic residues that are present at
positions a and d in the helix form a dimerization interface, with the remaining residues
generally being polar (see Ellenberger et al., 1992; Figure 1). Instead, in the HLH
domain many of the residues at the g and e positions are hydrophobic, especially in
helix 2, and the dimerization interface is in fact a core between the four helices, which
is shielded from solvent exposure (Ferre-D'Amare et al., 1993; Ellenberger et al., in
preparation). The structure determined for Max homodimers does not suggest an
obvious  explanation  for  how  HLH  dimerization  specificities  might  be  determined,  but 
by  comparing  it  with  other  bHLH  structures,  it  should  be  possible  to  formulate  testable 
hypotheses. 
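
The contrast drawn above between the ZIP and HLH interfaces can be made concrete by labelling residues with heptad positions (a through g) and asking what fraction of the residues at chosen positions is hydrophobic. The Python sketch below is illustrative only: the residue set treated as hydrophobic, the assumed starting register, and the idealized test sequence are assumptions, not data from this project.

```python
HEPTAD = "abcdefg"

# Assumption: a simple set of residues treated as hydrophobic for this sketch.
HYDROPHOBIC = set("AVILMFWY")


def heptad_positions(seq, start_register="a"):
    """Label each residue with a heptad position, starting at start_register."""
    offset = HEPTAD.index(start_register)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(seq)]


def hydrophobic_fraction(seq, positions, start_register="a"):
    """Fraction of residues at the given heptad positions that are hydrophobic."""
    chosen = [res for res, pos in heptad_positions(seq, start_register)
              if pos in positions]
    if not chosen:
        return 0.0
    return sum(res in HYDROPHOBIC for res in chosen) / len(chosen)


# Idealized, hypothetical zipper-like repeat: L at a and d, polar elsewhere.
ideal = "LKELEDK" * 2
print("a/d hydrophobic fraction:", hydrophobic_fraction(ideal, "ad"))
print("e/g hydrophobic fraction:", hydrophobic_fraction(ideal, "eg"))
```

Applied to real helix 1 and helix 2 sequences with a register taken from a structure, the same tally could be compared between the a/d and e/g positions, although the register itself would have to come from structural data rather than from this sketch.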

For  example,  the  bHLH  protein  E47  forms  dimers  with  relatively  high  affinity 
(Sun  and  Baltimore,  1991).  The  E47-DNA  complex  structure  which  was  determined 
by X-ray crystallography (Ellenberger et al., 1994) has revealed that, relative to Max,
E47 is characterized by an additional interaction between charged residues across
dimer  subunits.  This  interaction  appears  to  occur  between  a  histidine  (H)  residue  at 
the  end  of  helix  1  and  a  glutamic  acid  (E)  residue  near  the  end  of  helix  2  on  the 
opposite subunit. It involves an increase in the length of helix 1, and appears to be
potentially  able  to  contribute  significantly  to  dimerization  (Ellenberger  et  al.,  1994). 
The  structures  suggest  that  this  interaction  is  possible  because  E47  lacks  a  particular 
tyrosine  (Y)  residue  which  is  present  within  helix  2  in  many  bHLH  proteins,  including 
bHLH-ZIP  proteins  (Figure  1),  and  which  seems  to  present  steric  constraints  that 
prevent  helix  1  from  extending  as  far  as  in  E47. 

In  collaboration  with  Dr.  Ellenberger,  we  have  begun  to  test  whether  this 
interaction  is  critical,  by  substituting  the  apparently  relevant  residues  from  E2A 
proteins  into  MyoD,  which  forms  homodimers  poorly  (Sun  and  Baltimore,  1991). 
During  the  first  project  year,  we  created  a  series  of  E2A/MyoD  swap  mutants,  of  which 
MD/EA'QH2VE  (Figure  3)  contained  the  most  E2A  residues,  and  would  have  been 
predicted  by  modeling  to  undergo  the  interaction  described  above.  Surprisingly,  this 
protein  bound  DNA  at  a  level  slightly  lower  than  wild  type.  During  the  past  year  we 
created  the  mutant  MD/E2A'QH2VE  (Figure  3),  in  which  the  entire  E47  loop  region 
was  substituted  into  MyoD.  This  protein  also  formed  dimers  with  an  affinity  that  was 
only  approximately  the  same  as  wild  type.  These  results  indicated  that  these 
substitutions  did  not  allow  the  interaction  described  above  to  take  place,  or  perhaps 
that  the  binding  energy  derived  from  it  was  overcome  by  negative  effects  associated 
with  combining  these  particular  MyoD  and  E2A  residues. 

[Schematic of the MyoD bHLH domain: basic region, helix 1, loop, and helix 2, with residue positions 108, 125, 137, 146, and 166 marked.]

MD/EA'QH2VE: S135, V158, E163; loop TSSHLKSNQR
MD/E2A'QH2VE: S135, V158, E163; loop SQMHLKSDKAQT


Figure  3: 

MyoD  HLH  mutations.  The  basic  region,  helices  1  and  2,  and  the  loop  region 
are  indicated  by  differently  shaded  boxes.  Site-directed  substitution  mutants  are 
indicated  by  the  standard  one-letter  amino  acid  code.  Numbering  is  according  to  the 
sequence  of  full-length  MyoD.  Substituted  residues  are  indicated  in  bold  type. 

2.  Protein-DNA  interactions: 

Fortunately,  our  investigations  of  bHLH  protein-DNA  interactions  have  met  with 
more success. We have begun by looking at the bHLH protein MyoD, the functional
capabilities of which have been studied most extensively (Davis and Weintraub,
1992). We are investigating how certain residues affect binding preferences at
positions internal to and flanking the CA--TG consensus, in particular those
implicated in "positive control." We are analyzing these mutants by the selection
and amplification of binding sites (SAAB) technique of in vitro nucleic acid selection,
coupled with a pooled sequencing assay (Blackwell and Weintraub, 1990). In a
relatively rapid fashion, we can thus assay the consequences of mutations on DNA
binding, both by selecting preferred sequences and by selecting for pools of
sequences to which these complexes will not bind. We are then confirming the results
of  these  SAAB  assays  by  analyzing  binding  of  mutants  to  individual  oligonucleotides. 
These  results  are  expected  to  lead  to  modeling  studies,  which  will  be  employed  to 
design  future  mutagenesis  efforts  that,  in  the  case  of  particularly  definitive  mutants,  can 
be  subjected  to  crystallographic  analysis  by  Dr.  Ellenberger. 
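
The analysis of selected sites can be sketched computationally as a per-position tally of base frequencies across aligned sequences, which is roughly the information that a pooled sequencing readout conveys. The following Python sketch is an illustration under stated assumptions: the aligned example sites, the function names, and the 0.6 threshold are hypothetical, and the code is not the procedure of Blackwell and Weintraub (1990).

```python
from collections import Counter


def position_frequencies(sites):
    """Per-position base frequencies for a list of aligned, equal-length sites."""
    if not sites:
        raise ValueError("no sites given")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("sites must be aligned and of equal length")
    freqs = []
    for i in range(length):
        counts = Counter(site[i].upper() for site in sites)
        total = sum(counts.values())
        freqs.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return freqs


def consensus(freqs, threshold=0.6):
    """Most frequent base per position, or N where none reaches the threshold."""
    out = []
    for column in freqs:
        base, freq = max(column.items(), key=lambda item: item[1])
        out.append(base if freq >= threshold else "N")
    return "".join(out)


# Hypothetical aligned sites, standing in for sequences recovered by selection.
selected = ["CACCTG", "CAGCTG", "CACCTG", "CATCTG"]
print(consensus(position_frequencies(selected)))
```

Positions at which no base dominates are reported as N, so a shift in preference between a wild-type and a mutant protein would appear as a change in the called base or as a loss of discrimination at that position.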

Various  tissue-specific  bHLH  proteins,  such  as  MyoD,  function  as  heterodimers 
with  E2A  proteins  (Murre  et  al.,  1989).  These  different  heterodimer  combinations  can 
then bind to different versions of the CA--TG consensus; for example, MyoD/E2A
proteins  bind  to  sites  with  a  CACCTG  core  (Blackwell  and  Weintraub,  1990),  and  our 
experiments  have  now  determined  that  heterodimers  of  MyoD  with  the  bHLH  protein 
Twist, which is involved in mesoderm specification (see Michelson, 1996), bind to
CATATG  sites  (Figure  4).  The  observation  that  this  preference  is  different  over  the 
entire  site,  and  not  over  just  one  half,  suggests  that  different  E2A  partners  might 
recognize  different  sequences  by  positioning  both  bound  basic  regions  differently  on 
the DNA. Substitution of the BR from E12 (an E2A protein; Figure 1) for that of MyoD
(E12basic/MyoD;  Figure  4)  results  in  a  protein  that  will  bind  DNA  with  close  to  wild 
type  affinity  as  a  heterodimer  with  E2A,  but  will  not  induce  myogenesis  (Davis  et  al., 
1990;  Davis  and  Weintraub,  1992).  Remarkably,  a  homodimer  of  this  protein  binds 
preferentially  to  a  CATATG  site  that  is  identical  to  the  preferential  E2A/Twist  (or 
Twist/Twist) recognition site (Figure 4). Back-substitution of the MyoD A5 and T6
residues  into  E12basic/MyoD  restores  the  ability  to  induce  myogenesis  (Davis  et  al., 
1990;  Davis  and  Weintraub,  1992),  and  we  have  now  shown  that  a  homodimer  of  this 
back-substituted  protein  preferentially  recognizes  a  MyoD  consensus  (Figure  4).  This 
last  result  is  striking  in  that  "positive  control"  mutations  have  been  identified  at  these 
residues.  Our  results  suggest  that,  although  genetic  evidence  indicates  that  these 
residues  affect  protein-protein  interactions,  they  influence  protein-DNA  contact 
throughout the site. Indeed, in the MyoD structure obtained by X-ray crystallography,
these  residues  are  oriented  so  that  they  point  directly  into  the  major  groove  (Ma  et  al., 
1994).  The  most  straightforward  interpretation  of  our  results  is  that  these  residues  are 
involved  in  basic  region  positioning,  which  is  in  turn  involved  in  positive  control.  In  this 
interpretation,  recognition  of  the  same  site  by  E12basic/MyoD  and  Twist  is  not  a 
coincidence,  but  arises  from  analogous  positioning  of  critical  residues. 
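
As an illustration of how site preferences of this kind can be tabulated, the following minimal Python sketch scans sequences for the CANNTG consensus and tallies the central two bases (for example, TA in CATATG). The function names and the example sequences are hypothetical and are not data from these experiments.

```python
import re
from collections import Counter

# A lookahead is used so that overlapping CANNTG matches are all reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq):
    """Return (offset, site, central_pair) for every CANNTG match in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(1)[2:4]) for m in E_BOX.finditer(seq)]


def central_pair_counts(seqs):
    """Count the central dinucleotide of every CANNTG site in a set of sequences."""
    counts = Counter()
    for s in seqs:
        for _, _, central in find_e_boxes(s):
            counts[central] += 1
    return counts


# Hypothetical sequences, invented for illustration only.
example = ["GGACATATGTCC", "TTCATATGAA", "AACACGTGTT"]

if __name__ == "__main__":
    for pair, n in central_pair_counts(example).most_common():
        print(f"CA{pair}TG: {n}")
```

A tally of this kind covers only the central two bases. Preferences at flanking positions, which Figure 4 also reports, would need a per-position count over aligned sites.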

An  additional  MyoD  residue  of  biological  importance  is  K15  which,  when 
substituted  into  E12  along  with  A5  and  T6,  confers  the  ability  to  induce  myogenesis
(Davis  and  Weintraub,  1992).  This  K  residue  is  located  within  the  BR-helix  1  junction 
(Figures  1  and  4).  Our  experiments  indicate  that  a  heterodimer  of  E12basic/MyoD  with 
MyoDbasic/E12  binds  preferentially  to  the  "Twist"  CATATG  site  (Figure  4),  again 
indicating  mis-positioning  of  the  basic  regions.  This  mis-positioning  appears  to  be 
corrected  when  both  corresponding  junctions  are  also  substituted  (E12basic-J/MD  and 
MDbasic-J/E12;  Figure  4).  These  findings  indicate  a  pivotal  role  for  the  junctions  in 
positioning  the  BRs  and  suggest,  provocatively,  that  the  critical  K  residue  might  be 
involved.  They  are  also  consistent  with  the  notion  that  the  protein-protein  interactions 
that  are  implied  to  involve  the  MyoD  BR  might  depend  on  proper  positioning  of  these 
BRs  in  the  major  groove,  not  on  protein-protein  interactions  involving  the  "positive 
control"  residues  directly. 

We  have  now  begun  to  test  this  hypothesis  by  looking  at  DNA  binding  by 
additional  mutant  versions  of  MyoD.  Substitution  of  multiple  different  amino  acids  into 
the  A5  and  T6  positions  results  in  molecules  that  lose  discrimination  among  CA--TG
sites  (not  shown).  Consequently,  we  are  taking  the  approach  of  mutating
non-essential  BR  amino  acids  to  alanine  (Fisher  et  al.,  1993),  then  swapping  in
combinations  of  critical  residues  from  MyoD,  E2A,  Twist,  and  other  bHLH  proteins.  By
this  approach,  we  will  determine  whether  the  "positioning"  effects  we  have  observed
above  can  be  pinpointed  to  those  residues.  These  mutants  are  under  construction  by
a  research  technician,  Thip  Kophengnavong.  We  expect  that  these  experiments  can
be  extended  to  include  molecular  modeling,  and  to  address  recognition  of
"non-canonical"  sites  by  Myc-family  bHLH  proteins  (Blackwell  et  al.,  1993).

Figure  4:  DNA  binding  site  preferences  of  the  indicated  bHLH
proteins.  (A)  Basic  regions.  (B)  Binding  preferences.  In  A,  residues
conserved  among  all  bHLH  proteins  are  shaded.  Numbering  is  as  in  the  text.
Brackets  indicate  residues  that  were  substituted  from  the  indicated  proteins.
Amino  acids  that  are  shared  with  MyoD  are  underlined  in  the
other  bHLH  proteins.  In  B,  the  CANNTG  consensus  is
indicated  in  bold  type.  Bases  that  are  selected  against  are
indicated  by  underlining.

We  have  also  attempted  to  identify  a  binding  site  for  a  novel  Max  dimerization 
partner,  p18  (R.  Eisenman,  unpublished).  This  bHLH-ZIP  protein  dimerizes  well  with
Max,  but  is  not  a  member  of  either  the  Myc  or  Mad  families.  Heterodimers  of  Max  and
p18  do  not  bind  well  to  CACGTG  sites,  suggesting  that  they  may  have  a  novel  binding
specificity.  So  far,  we  have  not  been  successful  in  identifying  a  p18  recognition  site.

Through  a  collaboration,  we  also  developed  a  system  for  in  vivo  selection  of 
regulatory  sequences  that  respond  to  a  given  transcription  factor  (Huang  et  al.,  1996).
Sequences  that  allow  transcriptional  activation  by  MyoD  were  selected  from  a  random 
sequence  library,  which  had  been  cloned  into  a  promoter  in  place  of  a  required  MyoD 
binding  site.  This  promoter  library  was  placed  upstream  of  a  β-gal  reporter,  allowing
FACS  selection  of  cells  that  harbored  active  plasmids.  Three  rounds  of  selection  were 
performed,  each  of  which  involved  co-transfection  of  the  library  DNA  with  a  MyoD 
expression  vector,  followed  by  FACS  selection  of  cells  that  received  an  "active"  MyoD- 
responsive  construct,  then  expansion  of  the  selected  DNA  in  E.  coli.  Remarkably,  the 
selected  functional  sequences  represented  only  a  subset  of  the  allowed  MyoD  binding 
sites,  and  in  this  system  the  "best"  MyoD/E2A  binding  sites  were  inactive.  These 
findings  suggest  that  either  binding  in  an  appropriate  conformation,  or  binding  to  a 
particular  sequence,  may  be  required  for  transcriptional  activation  by  MyoD.  These 
results  are  of  particular  interest  in  light  of  the  importance  of  BR  positioning  that  is 
implied  by  the  experiments  described  above.  It  will  be  of  significant  interest  to 
investigate  the  basis  for  this  finding,  and  to  determine  whether  such  mechanisms 
might  be  characteristic  of  other  bHLH  proteins. 
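
The compounding effect of repeated sort-and-amplify rounds can be illustrated with a simple calculation. This is a sketch under stated assumptions, not a description of the published procedure: the starting fraction of active clones and the per-round fold enrichment are hypothetical parameters, and each round is assumed to favor active clones over inactive ones by the same constant factor.

```python
def enrich(active_fraction, fold_enrichment, rounds=3):
    """Track the fraction of active clones across selection rounds.

    active_fraction: starting fraction of the library that is active (0 to 1).
    fold_enrichment: assumed relative recovery of active versus inactive
                     clones in one round (hypothetical, constant per round).
    Returns a list holding the starting fraction followed by the fraction
    after each round.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must lie between 0 and 1")
    if fold_enrichment <= 0:
        raise ValueError("fold_enrichment must be positive")
    history = [active_fraction]
    f = active_fraction
    for _ in range(rounds):
        # Active clones are weighted by the enrichment factor, then the
        # pool is renormalized so that the fractions again sum to 1.
        f = f * fold_enrichment / (f * fold_enrichment + (1.0 - f))
        history.append(f)
    return history


if __name__ == "__main__":
    # Illustrative numbers only: 1 active clone in 1000, 50-fold per round.
    for i, f in enumerate(enrich(0.001, 50.0, rounds=3)):
        print(f"after round {i}: {f:.4f}")
```

Under these assumptions a rare active class comes to dominate the pool within a few rounds, which is the rationale for repeating the selection instead of relying on a single sort.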

B.  Investigation  of  DNA  binding  by  the  Mastermind  protein: 

In  our  second  specific  aim,  we  had  originally  proposed  to  use  in  vitro  selection 
to  isolate  single  stranded  nucleic  acid  molecules  (aptamers)  that  could  bind  to  c-Myc 
and  inhibit  its  dimerization  or  DNA  binding  (Ellington  and  Szostak,  1990).  Since  the 
original  proposal  was  submitted,  it  has  become  apparent  that  this  technology  is  being 
pursued  vigorously  by  numerous  biotechnology  companies,  many  of  which  have 
chemistry  departments  that  can  readily  synthesize  a  variety  of  modified  nucleotides 
that  can  be  used  in  these  experiments.  In  light  of  the  number  and  breadth  of  those
efforts,  I  have  chosen  to  devote  my  Army  Breast  Cancer  Program  award  strictly  to  basic 
research,  and  have  begun  to  investigate  DNA  binding  by  the  Mastermind  protein.  It  is 
hoped  that  this  research  will  provide  novel  insights  into  Notch  function,  and  thus  into  a 
pathway  that  is  linked  to  mammary  carcinoma. 

Ms.  Kophengnavong  has  begun  performing  in  vitro  selections  for  Mam  binding
sites,  using  a  GST  fusion  protein  that  contains  a  fragment  of  Mam  that  includes  its  BR.
So  far,  we  have  determined  that  this  Mam  fragment  binds  DNA  non-specifically  with  an
affinity  of  approximately  100  nM,  an  observation  which  is  a  promising  indicator  that  it
may  function  as  a  DNA  binding  protein.  We  are  continuing  our  selections  to  search  for 
a  specific  Mam  binding  site. 
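
For orientation, the meaning of an affinity near 100 nM can be sketched with a simple 1:1 binding isotherm. The calculation assumes that the reported value behaves as an apparent dissociation constant and that protein is in large excess over DNA. Both are assumptions made for illustration, and the function name is hypothetical.

```python
def fraction_bound(protein_nM, kd_nM=100.0):
    """Fraction of DNA bound at equilibrium for simple 1:1 binding.

    Assumes protein is in large excess over DNA, so that the free protein
    concentration is approximately the total protein concentration.
    """
    if protein_nM < 0:
        raise ValueError("protein concentration cannot be negative")
    if kd_nM <= 0:
        raise ValueError("Kd must be positive")
    return protein_nM / (protein_nM + kd_nM)


if __name__ == "__main__":
    # Illustrative titration points in nM; not measured values.
    for conc in (10.0, 100.0, 1000.0):
        print(f"{conc:7.1f} nM protein: {fraction_bound(conc):.2f} bound")
```

With these assumptions, half of the DNA is bound when the protein concentration equals the apparent dissociation constant. A sequence-specific site would be expected to reach half-saturation at a lower protein concentration than non-specific DNA does.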
CONCLUSIONS:


Our  investigations  of  bHLH  protein  dimerization  are  as  yet  incomplete,  because 
the  mutagenesis  performed  so  far  has  yielded  inconclusive  results.  However,  the 
structural  evidence  for  the  model  under  investigation  is  compelling.  Bearing  this  in 
mind,  we  will  continue  to  pursue  our  test  of  this  hypothesis  by  substituting  successively 
larger  regions  of  E47  into  MyoD.  However,  we  have  prioritized  our  DNA-binding 
experiments,  which  have  met  with  considerably  more  success. 

Our  studies  of  bHLH  protein-DNA  binding  support  the  novel  finding  that  BR
positioning  may  underlie  recognition  of  different  CA--TG  sites  by  different  dimers  of
E2A  and  its  partners,  an  issue  that  has  until  now  remained  a  mystery.  More  importantly,
this  phenomenon  may  be  linked  to  the  biological  activity  of  these  proteins.  For  now  we 
will  concentrate  on  MyoD,  the  biological  roles  of  which  have  been  more  widely  studied 
than  those  of  the  Myc  proteins,  and  on  other  E2A  partners.  We  will  next  expand  these 
experiments  to  address  how  Myc  proteins  recognize  the  non-canonical  sequences. 

The  system  that  we  have  developed  for  in  vivo  selection  of  functional  binding 
sites  (Huang  et  al.,  1996)  will  prove  useful  in  a  variety  of  areas,  because  it  allows 
identification  and  comparison  of  regulatory  sites  that  respond  to  a  particular  protein  in 
the  context  of  different  promoters  and  co-factors.  It  may  ultimately  prove  helpful  for 
investigations  of  how  Myc  can  activate  some  genes  and  repress  others. 

As  indicated  above,  our  finding  that  Mam  binds  to  DNA  at  least  non-specifically 
is  encouraging,  and  is  consistent  with  the  idea  that  it  functions  as  a  DNA  binding 
protein.  These  experiments,  if  successful,  will  fill  in  an  important  missing  link  in  the 
Notch  pathway,  which  has  been  linked  to  mammary  carcinoma.  They  will  also  open 
the  possibility  of  using  biochemical  strategies  to  isolate  vertebrate  Mam  homologs. 